ABSTRACT:

This paper surveys audio denoising techniques. Most of these techniques target Gaussian white
noise in audio signals. Both diagonal and non-diagonal estimation techniques are discussed, and the
denoising techniques and noise types are organized in taxonomies.
I. INTRODUCTION

Audio is corrupted by different types of noise during acquisition. The aim of audio denoising is to
attenuate the noise without modifying the original signal. Applications include music and speech
restoration. Audio denoising techniques fall into two classes: diagonal estimation techniques and
non-diagonal estimation techniques. Diagonal time-frequency audio denoising algorithms attenuate the
noise by processing each spectrogram coefficient independently. Their drawbacks are limited performance
and a denoised signal contaminated by musical noise, which degrades audio perception. Non-diagonal
estimation techniques are required to overcome these drawbacks [6], [7], [11].
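
The independence of diagonal estimators can be illustrated with a short sketch. It is a minimal example under
stated assumptions, not a method taken from the cited papers: it assumes a power-subtraction style gain
(1 - lam * noise_var / |Y|^2)_+, a known noise variance, and a spectrogram stored as a list of frames. The
names diagonal_attenuation, denoise_diagonal, noise_var and lam are hypothetical.

```python
def diagonal_attenuation(power, noise_var, lam=1.0):
    """Gain for one time-frequency coefficient, computed from its own power only."""
    if power <= 0.0:
        return 0.0
    return max(1.0 - lam * noise_var / power, 0.0)


def denoise_diagonal(spectrogram, noise_var, lam=1.0):
    """Scale every coefficient independently of its neighbours.

    spectrogram is a list of frames; each frame is a list of real or
    complex coefficients. noise_var is assumed to be known.
    """
    return [
        [c * diagonal_attenuation(abs(c) ** 2, noise_var, lam) for c in frame]
        for frame in spectrogram
    ]


if __name__ == "__main__":
    # Noise-only spectrogram in which one coefficient happens to be large.
    noisy = [[0.5, 0.4, 3.0], [0.3, 0.6, 0.2]]
    for frame in denoise_diagonal(noisy, noise_var=1.0):
        print(frame)
```

In this example every coefficient is set to zero except the single large one, which survives on its own. Such
isolated time-frequency survivors are what is heard as musical noise.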

II. AUDIO DENOISING - RELATED WORKS 

A wavelet-based algorithm for audio denoising is discussed in [1]. The authors focused on audio
signals corrupted with white noise, which is especially hard to remove because it is present at all
frequencies. They used the Discrete Wavelet Transform (DWT) to transform the noisy audio signal into the
wavelet domain, assuming that the signal is represented by high-amplitude DWT coefficients and the noise by
low-amplitude coefficients. To obtain an audio signal with less noise, the coefficients are thresholded and
transformed back to the time domain. The authors proposed a modified universal thresholding of the
coefficients, which yields a better audio signal. Objective Difference Grade (ODG) was the main criterion for
evaluating the experimental results. The authors also compared ODG with Mean Square Error (MSE), which is
widely used for estimating signal quality. MSE showed little enhancement or even a loss, while ODG and
informal listening tests indicated a significant enhancement of signal quality. The algorithm worked better
for signals with low noise levels. Signals with higher noise levels require a higher threshold, which removes
part of the original signal along with the noise and causes audible artifacts in the denoised signal.
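
The transform, threshold, and inverse-transform steps described for [1] can be sketched as follows. This is a
minimal illustration, not the authors' implementation: it assumes a single-level orthonormal Haar transform and
the standard universal threshold sigma * sqrt(2 ln N), whereas [1] proposes a modified universal threshold whose
form is not given here. The function names and the known noise level sigma are assumptions made for the example.

```python
import math


def haar_dwt(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x


def soft(c, t):
    """Soft thresholding: shrink the magnitude by t, zero below t."""
    return math.copysign(max(abs(c) - t, 0.0), c)


def hard(c, t):
    """Hard thresholding: keep the coefficient unchanged if it exceeds t."""
    return c if abs(c) > t else 0.0


def universal_threshold(sigma, n):
    """Standard universal threshold for n samples and noise level sigma."""
    return sigma * math.sqrt(2.0 * math.log(n))


def denoise(x, sigma, rule=soft):
    """Threshold the detail coefficients and transform back to the time domain."""
    t = universal_threshold(sigma, len(x))
    approx, detail = haar_dwt(x)
    return haar_idwt(approx, [rule(d, t) for d in detail])
```

Raising sigma raises the threshold, so more detail coefficients are zeroed. This is the trade-off noted above:
at high noise levels the threshold also removes signal detail.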

In [2], the authors investigate block attenuation methods that were initially applied in orthogonal
wavelet signal representations [3]. They study the block size and the thresholding level in redundant
time-frequency signal representations and find that block attenuation eliminates the residual noise artifacts
in restored signals and provides a good approximation of the attenuation with oracle. They also study a
connection between block attenuation and the decision-directed a priori SNR estimator of Ephraim and Malah,
and introduce an adaptive block technique based on the dyadic CART algorithm [4, 5]. The experiments show
that the proposed method eliminates the residual noise artifacts and preserves signal transients better than
methods that use the short-time Fourier transform [2]. The experiments were performed on speech signals
sampled at 11 kHz and corrupted by white Gaussian noise. Block attenuation performs well compared with
other methods such as Adaptive Block Attenuation with Complex Wavelets, Hard Thresholding with Complex
Wavelets, and the Ephraim and Malah decision-directed a priori SNR estimator + Wiener with Complex
Wavelets / Short-Time Fourier. A number of experiments were also performed on various music signals.
Adaptive block attenuation performs well compared with conventional thresholding operators, and it yields
sharper note transitions than the short-time Fourier estimate. However, short-time Fourier denoising performs
better than its wavelet counterpart on stationary parts with high pitch, because the short-time Fourier
transform has higher frequency resolution than the wavelet representation in high-frequency bands. In [8],
the denoising problem is considered from the viewpoint of sparse atomic representation. The authors proposed
a general framework of time-frequency soft thresholding that encompasses and connects well-known shrinkage
operators as special cases. The convergence of the corresponding algorithms is evaluated numerically, and
their performance in denoising real-life audio signals is compared with that of similar existing approaches.
The novel approach is competitive in terms of signal-to-noise ratio and improves the state of the art in terms
of perceptual criteria. From the denoising point of view, the neighborhood weighting can be considered a form
of non-diagonal estimation. These approaches reduce the musical noise that naturally arises in diagonal
estimation.
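
The idea of neighborhood weighting can be sketched as follows. This is one possible form, assumed here for
illustration and not taken from [8]: each coefficient is shrunk by a gain (1 - lam / E)_+, where E is the square
root of the energy in a sliding window centred on that coefficient. The function name, the one-dimensional
window, and the parameters lam and half_width are hypothetical.

```python
import math


def neighborhood_shrink(coeffs, lam, half_width=1):
    """Shrink each coefficient using the energy of a window centred on it.

    With half_width=0 the window contains only the coefficient itself and
    the rule reduces to ordinary (diagonal) soft thresholding.
    """
    n = len(coeffs)
    out = []
    for k, c in enumerate(coeffs):
        lo = max(0, k - half_width)
        hi = min(n, k + half_width + 1)
        energy = math.sqrt(sum(abs(v) ** 2 for v in coeffs[lo:hi]))
        gain = max(1.0 - lam / energy, 0.0) if energy > 0.0 else 0.0
        out.append(c * gain)
    return out


if __name__ == "__main__":
    isolated = [0.1, 2.0, 0.1]    # a lone peak, typical of noise
    sustained = [2.0, 2.0, 2.0]   # the same peak inside a persistent structure
    print(neighborhood_shrink(isolated, lam=2.5))
    print(neighborhood_shrink(sustained, lam=2.5))
```

With the same threshold, the lone peak is removed while the peak supported by its neighbours is kept with a
reduced amplitude. A diagonal rule cannot make this distinction, because it sees only the value 2.0 in both cases.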

In [9], significant improvements in audio denoising are obtained by exploiting the persistence
properties of signals. The paper concerns operator design: it derives a novel audio denoising operator based
on neighborhood-smoothed, Wiener-filter-like shrinkage, the persistent empirical Wiener estimate, which fuses
recent developments in the field of structured sparsity with the properties of empirical Wiener filtering. A
rationale for adaptive threshold selection according to a given performance criterion is proposed. A plain
linear model depending on the noise level performs only slightly worse than the optimal thresholds. A simple
method for estimating the noise level when it is unknown is also proposed. The proposed operators perform
competitively with the state of the art while being much more computationally efficient and robust to minor
perturbations of the noise level. The method presented in [10] is based on the Singular Value Decomposition
(SVD) of the frame matrix representing the signal in the overlap-add decomposition. Both the singular values
and the singular vectors of the representation are modified to perform denoising: a tapering model is used for
the former and a nonlinear PDE method for the latter. The aim of the technique is to reduce additive random
noise that has corrupted the signal. The authors tested the method on a variety of speech and music sounds
corrupted with additive Gaussian noise, using a sampling rate of 16 kHz for speech and 44.1 kHz for music.
They compared their method with the Savitzky-Golay filter in terms of MSE and SNR. The results show that
their method performs well in reducing noise.

In [11], the method is non-diagonal: block parameters are automatically adjusted to the nature of the
audio signal by minimizing a Stein estimator of the risk, which is calculated analytically from the noisy
signal values. Block thresholding, which groups the time-frequency coefficients into blocks and then
attenuates them, is used to eliminate musical noise. Diagonal time-frequency audio denoising algorithms lack
time-frequency regularity and therefore create isolated time-frequency structures, which are perceived as
musical noise. Block thresholding regularizes the estimate and reduces musical noise efficiently. In [12], an
adaptive time-frequency block thresholding procedure using the discrete wavelet transform is used to reduce
the noise in the audio signal and achieve a better SNR. Both soft thresholding and hard thresholding are
applied, and the authors compared their results. Soft thresholding performed better than hard thresholding.
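
The grouping and attenuation steps of block thresholding can be sketched as follows. This is a simplified
illustration rather than the procedure of [11]: it assumes a fixed block size, a known noise variance, and a
single gain (1 - lam * noise_var / mean block power)_+ per block. The adaptive choice of block size by
minimizing the Stein risk estimate is not shown. All function and parameter names are hypothetical.

```python
def block_attenuation(block, noise_var, lam):
    """One gain shared by a whole block, based on the block's mean power."""
    values = [c for row in block for c in row]
    mean_power = sum(abs(c) ** 2 for c in values) / len(values)
    if mean_power <= 0.0:
        return 0.0
    return max(1.0 - lam * noise_var / mean_power, 0.0)


def block_threshold(spec, noise_var, lam, height, width):
    """Partition spec into height x width blocks and attenuate each block.

    spec is a list of frames (time), each a list of coefficients (frequency).
    Blocks at the edges may be smaller than height x width.
    """
    n_t, n_f = len(spec), len(spec[0])
    out = [list(frame) for frame in spec]
    for t0 in range(0, n_t, height):
        for f0 in range(0, n_f, width):
            block = [frame[f0:f0 + width] for frame in spec[t0:t0 + height]]
            gain = block_attenuation(block, noise_var, lam)
            for t in range(t0, min(t0 + height, n_t)):
                for f in range(f0, min(f0 + width, n_f)):
                    out[t][f] = spec[t][f] * gain
    return out


if __name__ == "__main__":
    # Left block: a sustained component. Right block: one isolated peak.
    noisy = [[2.0, 2.0, 0.1, 2.0],
             [2.0, 2.0, 0.1, 0.1]]
    for frame in block_threshold(noisy, noise_var=1.0, lam=1.5, height=2, width=2):
        print(frame)
```

Because the gain depends on the whole block, the isolated peak in the right block is removed together with its
low-energy neighbours, while the sustained component in the left block is kept. A diagonal rule with the same
parameters would keep the isolated peak.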

Matching Pursuit (MP) is a greedy algorithm that iteratively builds a sparse signal representation. An
analysis of Matching Pursuit in the context of audio denoising is presented in [13]. The authors interpreted
the algorithm as a simple shrinkage approach, identified factors critical to its success, and proposed several
approaches to improve its performance and robustness. They presented experimental results on a wide range of
audio signals and showed that the method yields results that are competitive with other audio denoising
approaches. They introduced a new audio denoising approach called Greedy Time-Frequency Shrinkage (GTFS) that
produces competitive denoising results in terms of standard performance metrics, Signal to Noise Ratio (SNR)
and Perceptual Evaluation of Audio Quality (PEAQ). The authors focused on the removal of uncorrelated Gaussian
white noise from music and speech signals. The various audio denoising techniques are shown in the taxonomy
of Figure 1, where MMSE-LSA is the Minimum Mean Square Error Log Spectral Amplitude estimation algorithm.
Different noise types are shown in the taxonomy of Figure 2.
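
The greedy iteration behind Matching Pursuit, described at the start of the preceding paragraph, can be sketched
as follows. This is the generic textbook algorithm under simplifying assumptions (real-valued signals, a small
explicit dictionary of unit-norm atoms, a fixed number of iterations). It is not the GTFS method of [13], and all
names are hypothetical.

```python
def dot(u, v):
    """Inner product of two real-valued sequences of equal length."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, n_iter):
    """Greedy sparse decomposition of signal over unit-norm atoms.

    Each iteration picks the atom most correlated with the residual,
    records its coefficient, and subtracts its contribution.
    Returns (coefficients per atom, final residual).
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        corr = [dot(residual, a) for a in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        coeffs[best] += corr[best]
        residual = [r - corr[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


if __name__ == "__main__":
    atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    coeffs, residual = matching_pursuit([3.0, 0.1, -2.0], atoms, n_iter=2)
    print(coeffs, residual)
```

For denoising, the iteration is stopped early: the selected atoms form the signal estimate, and the residual,
which holds the small components that were never selected, is treated as noise.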
[Figure 1: taxonomy diagram. Node labels recovered from the figure: Wiener subtraction; Non diagonal
estimation technique; Thresholding operators; MMSE-LSA; P-point uncertainty model; Block Thresholding]

Figure 1. Audio denoising taxonomy



[Figure 2: taxonomy diagram. Root node: NOISE. Leaf labels as extracted: White noise; Gaussian noise;
Gaussian additive colored noise; White noise; Gaussian noise]

Figure 2. Noise taxonomy



Table 1 compares the performance of Block Thresholding [7], Block Thresholding (BT) with soft-thresholding
wavelet [12], BT with hard-thresholding wavelet [12], the Minimum Mean Square Error Log Spectral Amplitude
estimation algorithm (MMSE-LSA) [7], and MMSE-LSA using the decision-directed method (MMSE-LSA-DD) [7]
on the Mozart signal for different SNR values.



Table 1. Performance comparison

Signal and SNR   Block          BT with soft   BT with hard   MMSE-LSA   LSA-DD
                 Thresholding   thresholding   thresholding
                                wavelet        wavelet

Mozart 5 dB      14.90          11.57          9.S4           7.625      7.625
Mozart 10 dB     IS 31          15.02          12.76          12.625
Mozart 15 dB     22.03          18.75          15.97          17.727
Mozart 20 dB     25.14          22.64          19.48          22.325
Mozart 25 dB     30.29          27.43          23.62          2S.7S5


From Table 1, we conclude that the block thresholding technique is more efficient than the other listed
techniques because it achieves the highest signal-to-noise ratio at each input SNR.
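
The signal-to-noise ratio used as the comparison criterion can be computed as sketched below. This assumes the
usual definition, 10 log10 of the ratio between the energy of the clean signal and the energy of the estimation
error, and assumes the clean reference is available, as it is in experiments with artificially added noise. The
function name is hypothetical.

```python
import math


def snr_db(clean, estimate):
    """SNR in dB of an estimate with respect to the clean reference signal."""
    signal_energy = sum(c * c for c in clean)
    error_energy = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]
    denoised = [0.9, -1.1, 1.1, -0.9]
    print(round(snr_db(clean, denoised), 2))
```

Applying the same function to the noisy input and to the denoised output gives the input and output SNR, and
their difference is the improvement obtained by a denoising method.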
III. CONCLUSIONS

Audio is corrupted by different types of noise during acquisition. Audio denoising is the process of
removing such noise from audio signals. This paper discussed different audio denoising techniques. From the
survey, we conclude that non-diagonal estimation techniques are more effective than diagonal estimation
techniques because they avoid producing musical noise.